MNIST Neural Network Trainer

Network Architecture

First Hidden Layer Size

16 64 128

Activation Function

Dropout Rate

Second Hidden Layer Size

8 32 64

Activation Function

Dropout Rate

Training Epochs

3 5 15

Training Status: Not Trained

Test Your Model

Draw a digit (0-9) below. The prediction updates automatically.

28x28 input

Prediction

About This Demo

This demo trains a neural network on the MNIST dataset, which contains 60,000 training images and 10,000 test images of handwritten digits (0-9). Each image is 28x28 pixels in grayscale.

The network is a fully connected feedforward network: each 28x28 image is flattened into a vector of 784 pixel values, passed through two hidden layers with configurable sizes and activation functions, and then fed into a 10-neuron output layer whose softmax activation produces a probability distribution over the ten digits.
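The forward pass described above can be sketched in plain Python. This is only an illustration, not the demo's actual code: the hidden sizes (64 and 32), the ReLU activation, the random untrained weights, and the pixel scaling to the range 0 to 1 are all assumptions, and the helper names (`init_layer`, `dense`, `forward`) are hypothetical.

```python
import math
import random


def relu(values):
    """ReLU activation; assumed here, since the demo lets you choose the activation."""
    return [max(0.0, v) for v in values]


def softmax(logits):
    """Turn raw output scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(n_in, n_out, rng):
    """Random weights (n_out rows of n_in values) and zero biases; untrained."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(inputs, weights, biases):
    """Fully connected layer: one weighted sum plus bias per output neuron."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def forward(image, layers):
    """Flatten a 28x28 image, apply the hidden layers, then the softmax output."""
    activations = [pixel for row in image for pixel in row]  # 784 values
    *hidden_layers, output_layer = layers
    for weights, biases in hidden_layers:
        activations = relu(dense(activations, weights, biases))
    weights, biases = output_layer
    return softmax(dense(activations, weights, biases))


rng = random.Random(0)
sizes = [784, 64, 32, 10]  # input, two hidden layers (assumed sizes), output
layers = [init_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]

# A stand-in "image": 28 rows of 28 grayscale values in [0, 1].
image = [[rng.random() for _ in range(28)] for _ in range(28)]
probs = forward(image, layers)
predicted_digit = max(range(10), key=probs.__getitem__)
```

Because the weights are random, `predicted_digit` is meaningless here; the point is only the shape of the computation: 784 inputs, two hidden layers, and 10 output probabilities.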

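Try changing the architecture and retraining to see how it affects accuracy. Smaller networks train faster but may underfit; larger networks have more capacity but may overfit the training data. Dropout is a regularization technique: during training, it randomly disables a fraction of a layer's neurons, set by the dropout rate, so the network cannot rely too heavily on any single neuron.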
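As a rough illustration of dropout, the sketch below uses the common "inverted dropout" variant, in which the surviving activations are scaled up during training so that no rescaling is needed at prediction time. Whether the demo's underlying library does exactly this is an assumption, and the `dropout` function and its parameters are hypothetical names chosen for this example.

```python
import random


def dropout(activations, rate, rng, training=True):
    """Inverted dropout: zero each activation with probability `rate` while
    training and scale the survivors by 1 / (1 - rate); pass values through
    unchanged when not training."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        return list(activations)
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


rng = random.Random(42)
hidden = [1.0] * 1000  # stand-in activations from a hidden layer

dropped = dropout(hidden, 0.2, rng)                    # training: roughly 20% zeroed
zeroed = dropped.count(0.0)
unchanged = dropout(hidden, 0.2, rng, training=False)  # prediction: no change
```

With a rate of 0.2, about one in five activations is zeroed on each call and the rest are scaled by 1 / 0.8, which keeps the expected sum of the layer's activations the same in training and at prediction time.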